AITopics | serverless platform

eb3c8135137c8a60425a0320869ad87e-Paper-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 15:50:58 GMT

Recently, reinforcement learning (RL) based approaches have attracted increasing attention for dynamic resource management, as RL helps automatically adapt to a specific user workload.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

A Mean-Field Game Approach to Cloud Resource Management with Function Approximation

Neural Information Processing Systems | Dec-25-2025, 15:18:15 GMT

Reinforcement learning (RL) has gained increasing popularity for resource management in cloud services such as serverless computing. As self-interested users compete for shared resources in a cluster, the multi-tenancy nature of serverless platforms necessitates multi-agent reinforcement learning (MARL) solutions, which often suffer from severe scalability issues. In this paper, we propose a mean-field game (MFG) approach to cloud resource management that is scalable to a large number of users and applications and incorporates function approximation to deal with the large state-action spaces in real-world serverless platforms. Specifically, we present an online natural actor-critic algorithm for learning in MFGs compatible with various forms of function approximation. We theoretically establish its finite-time convergence to the regularized Nash equilibrium under linear function approximation and softmax parameterization. We further implement our algorithm using both linear and neural-network function approximations, and evaluate our solution on an open-source serverless platform, OpenWhisk, with real-world workloads from production traces. Experimental results demonstrate that our approach is scalable to a large number of users and significantly outperforms various baselines in terms of function latency and resource utilization efficiency.
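To make the phrase "linear function approximation and softmax parameterization" concrete, here is a minimal sketch of a softmax policy over linear action scores. It is not the paper's natural actor-critic algorithm; the function name `softmax_policy`, the parameter vector `theta`, and the feature vectors are hypothetical, and in a mean-field setting the features would presumably also encode a summary of the population state.

```python
import math


def softmax_policy(theta, features_per_action):
    """Return action probabilities pi(a | s) proportional to exp(theta . phi(s, a)).

    theta: list of floats (policy parameters, hypothetical).
    features_per_action: one feature vector phi(s, a) per candidate action.
    """
    # Linear score for each action: the dot product of theta and phi(s, a).
    logits = [sum(t * f for t, f in zip(theta, phi)) for phi in features_per_action]
    # Subtract the maximum logit before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Example: three candidate resource allocations described by two made-up features.
probs = softmax_policy([0.5, -1.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

With all parameters at zero the policy is uniform over actions; learning then shifts probability toward actions whose features score higher under `theta`.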

cloud resource management, function approximation, mean-field game approach, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.83)

eb3c8135137c8a60425a0320869ad87e-Paper-Conference.pdf

Neural Information Processing Systems | Aug-19-2025, 16:32:17 GMT

cloud computing, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Cloud Computing (0.94)

A Mean-Field Game Approach to Cloud Resource Management with Function Approximation

Neural Information Processing Systems | Jan-19-2025, 05:30:47 GMT

cloud resource management, function approximation, mean-field game approach, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

Optimizing Distributed Deployment of Mixture-of-Experts Model Inference in Serverless Computing

Liu, Mengfan, Wang, Wei, Wu, Chuan

arXiv.org Artificial Intelligence | Jan-9-2025

With the advancement of serverless computing, running machine learning (ML) inference services over a serverless platform has been advocated, given its labor-free scalability and cost effectiveness. Mixture-of-Experts (MoE) models, with their parallel expert networks, have become a dominant architecture for enabling large models. Serving large MoE models on serverless computing is potentially beneficial, but has been underexplored due to substantial challenges in handling skewed expert popularity and the scatter-gather communication bottleneck in MoE model execution, both of which stand in the way of cost-efficient serverless MoE deployment with performance guarantees. We study optimized MoE model deployment and distributed inference serving on a serverless platform that effectively predicts expert selection, pipelines communication with model execution, and minimizes the overall billed cost of serving MoE models. Specifically, we propose a Bayesian optimization framework with multi-dimensional epsilon-greedy search to learn expert selections and the MoE deployment that achieves optimal billed cost, including: 1) a Bayesian decision-making method for predicting expert popularity; 2) flexibly pipelined scatter-gather communication; and 3) an optimal model deployment algorithm for distributed MoE serving. Extensive experiments on AWS Lambda show that our designs reduce the billed cost of all MoE layers by at least 75.67% compared to CPU clusters while maintaining satisfactory inference throughput. Compared to LambdaML in serverless computing, our designs achieve 43.41% lower cost with a throughput decrease of at most 18.76%.

inference, moe layer, serverless function, (15 more...)

arXiv.org Artificial Intelligence

2501.05313

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Transactions and Serverless are Made for Each Other

Communications of the ACM | Nov-20-2024, 20:40:06 GMT

Serverless cloud offerings are becoming increasingly popular for stateless applications because they simplify cloud deployment. This article argues that if serverless platforms could wrap functions in database transactions, they would also be a good fit for database-backed applications. There are two unique benefits of such a transactional serverless platform: time-travel debugging of past events and reliable program execution with "exactly-once" semantics. Serverless cloud platforms such as Amazon Web Services (AWS) Lambda and Azure Functions are increasingly popular for building production applications as varied as website front ends, machine-learning (ML) pipelines, and image-processing systems. These platforms radically simplify development by managing application deployment.

application, platform, serverless platform, (14 more...)

Communications of the ACM

Genre: Research Report (0.37)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Cloud Computing (0.72)
Information Technology > Sensing and Signal Processing > Image Processing (0.57)
Information Technology > Artificial Intelligence > Machine Learning (0.57)

AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments

Schuler, Lucia, Jamil, Somaya, Kühl, Niklas

arXiv.org Artificial Intelligence | May-29-2020

Serverless computing has emerged as a compelling new paradigm of cloud computing models in recent years. It promises the user services at large scale and low cost while eliminating the need for infrastructure management. On the cloud provider side, flexible resource management is required to meet fluctuating demand. It can be enabled through automated provisioning and deprovisioning of resources. A common approach among both commercial and open source serverless computing platforms is workload-based auto-scaling, where a designated algorithm scales instances according to the number of incoming requests. In the recently evolving serverless framework Knative, a request-based policy is proposed, where the algorithm scales resources by a configured maximum number of requests that can be processed in parallel per instance, the so-called concurrency. As we show in a baseline experiment, this predefined concurrency level can strongly influence the performance of a serverless application. However, identifying the concurrency configuration that yields the highest possible quality of service is a challenging task because various factors, e.g., varying workload and complex infrastructure characteristics, influence throughput and latency. While there has been considerable research into intelligent techniques for optimizing auto-scaling for virtual machine provisioning, this topic has not yet been discussed in the area of serverless computing. For this reason, we investigate the applicability of a reinforcement learning approach, which has been proven on dynamic virtual machine provisioning, to request-based auto-scaling in a serverless framework. Our results show that within a limited number of iterations our proposed model learns an effective scaling policy per workload, improving the performance compared to the default auto-scaling configuration.
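The request-based rule described above, and the idea of learning the concurrency setting with RL, can be sketched as follows. This is a simplified illustration rather than Knative's autoscaler or the authors' model: the function names, the floor of one instance, the `max_instances` cap, and the tabular Q-learning setup with its hyperparameters are all assumptions.

```python
import math
import random


def desired_instances(in_flight_requests, target_concurrency, max_instances=100):
    """Instances needed so each handles at most target_concurrency parallel requests."""
    if target_concurrency <= 0:
        raise ValueError("target_concurrency must be positive")
    needed = math.ceil(in_flight_requests / target_concurrency)
    # Assumption: keep at least one instance and respect an upper cap.
    return min(max_instances, max(1, needed))


def choose_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy choice among candidate concurrency adjustments."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    return max(actions, key=lambda a: q.get((state, a), 0.0))  # exploit


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; unseen (state, action) pairs default to 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
```

In such a loop, the state might be the current concurrency level, the actions small increases or decreases of it, and the reward a measured combination of throughput and latency; those choices are left open here.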

machine learning, reinforcement learning, workload, (20 more...)

arXiv.org Artificial Intelligence

2005.1441

Country: Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Services (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Finding photos on Twitter using face recognition with TensorFlow.js

#artificialintelligence | May-12-2020, 10:12:03 GMT

As a developer advocate, I spend a lot of time at developer conferences (talking about serverless). Upon returning from each trip, I need to compile a "trip report" on the event for my bosses. This helps demonstrate the value in attending events and that I'm not just accruing air miles and hotel points for fun… I always include any social media content people post about my talks in the trip report. This is usually tweets with photos of me on stage. If people are tweeting about your session, I assume they enjoyed it and wanted to share with their followers.

application, artificial intelligence, social media, (16 more...)

#artificialintelligence

Industry: Information Technology (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)

How to run Tensorflow.js on a serverless platform : deploying models

#artificialintelligence | May-7-2020, 17:08:29 GMT

This is the last part of a three-article series. In the first part, we introduced neural networks and the basics of the TensorFlow framework. In the second part, we explained how to convert existing models from Python to TensorFlow.js. Finally, today we present, through an example, how to use an online TensorFlow.js model and deploy it rapidly using our WarpJS JavaScript Serverless Function-as-a-Service (FaaS). Many public models can be retrieved from web databases.

artificial intelligence, machine learning, tensorflow, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Rise of Serverless Computing

Communications of the ACM | Dec-6-2019, 03:30:53 GMT

Cloud computing in general, and Infrastructure-as-a-Service (IaaS) in particular, have become widely accepted and adopted paradigms for computing with the offerings of virtual machines (VM) on demand. By 2020, 67% of enterprise IT infrastructure and software spending will be for cloud-based offerings [16]. A major factor in the increased adoption of the cloud by enterprise IT was its pay-as-you-go model, where a customer pays only for resources leased from the cloud provider and has the ability to get as many resources as needed with no up-front cost (elasticity) [2]. Unfortunately, the burden of scaling was left to developers and system designers, who typically used overprovisioning techniques to handle sudden surges in service requests. Studies of reported usage of cloud resources in datacenters [19] show a substantial gap between the resources that cloud customers allocate and pay for (leasing VMs) and actual resource utilization (CPU, memory, and so on). Serverless computing is emerging as a new and compelling paradigm for the deployment of cloud applications, largely due to the recent shift of enterprise application architectures to containers and microservices [23]. Using serverless gives pay-as-you-go without additional work to start and stop servers and is closer to the original expectation that cloud computing be treated like a utility [2]. Developers using serverless computing can get cost savings and scalability without needing to have a high level of cloud computing expertise that is time-consuming to acquire. Due to its simplicity and economic advantages, serverless computing is gaining popularity, as reported by the increasing rate of the "serverless" search term in Google Trends. Its market size is estimated to grow to 7.72 billion by 2021 [10]. Most prominent cloud providers, including Amazon, IBM, Microsoft, Google, and others, have already released serverless computing capabilities, with several additional open source efforts driven by both industry and academic institutions (for example, see the CNCF Serverless Cloud Native Landscape).

application, computing, serverless computing, (12 more...)

Communications of the ACM

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(3 more...)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Communications > Web (0.46)
(2 more...)
